AITopics | drl algorithm

Collaborating Authors

drl algorithm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space Hongyao Tang

Neural Information Processing SystemsFeb-16-2026, 06:04:18 GMT

Deep Reinforcement Learning (DRL) is far from well understood, although its great potential has been demonstrated with a lot of achievements in different practical problems [Badia et al., 2020, Shah et al., 2022, Fawzi et al., 2022, Degrave et al., 2022, OpenAI, 2022]. Consistent efforts are made to gain a better understanding of the learning dynamics of RL agents.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Shannon McCurdy

Neural Information Processing SystemsFeb-15-2026, 00:01:12 GMT

While ridge regression provides shrinkage for the regression coefficients, manyofthecoefficients remain smallbutnon-zero. Performing ridgeregression with the matrix sketch returned by our algorithm and a particular regularization parameter forces coefficients to zero and has a provable(1+) bound on the statisticalrisk.

artificial intelligence, leverage score, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.47)

Industry: Health & Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Add feedback

29ef811e72b2b97cf18dd5d866b0f472-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 08:24:49 GMT

algorithm, equation, lnss, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Arizona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

29ef811e72b2b97cf18dd5d866b0f472-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 08:24:43 GMT

algorithm, drl algorithm, lnss, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Arizona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Neural Information Processing SystemsDec-24-2025, 12:05:27 GMT

Distributional reinforcement learning (DRL) estimates the distribution over future returns instead of the mean to more efficiently capture the intrinsic uncertainty of MDPs. However, batch-based DRL algorithms cannot guarantee the non-decreasing property of learned quantile curves especially at the early training stage, leading to abnormal distribution estimates and reduced model interpretability. To address these issues, we introduce a general DRL framework by using non-crossing quantile regression to ensure the monotonicity constraint within each sampled batch, which can be incorporated with any well-known DRL algorithm. We demonstrate the validity of our method from both the theory and model implementation perspectives. Experiments on Atari 2600 Games show that some state-of-art DRL algorithms with the non-crossing modification can significantly outperform their baselines in terms of faster convergence speeds and better testing performance. In particular, our method can effectively recover the distribution information and thus dramatically increase the exploration efficiency when the reward space is extremely sparse.

distributional reinforcement learning, name change, non-crossing quantile regression, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Add feedback

Sum Rate Maximization in STAR-RIS-UAV-Assisted Networks: A CA-DDPG Approach for Joint Optimization

Huang, Yujie, Wan, Haibin, Li, Xiangcheng, Qin, Tuanfa, Li, Yun, Li, Jun, Chen, Wen

arXiv.org Artificial IntelligenceDec-2-2025

With the rapid advances in programmable materials, reconfigurable intelligent surfaces (RIS) have become a pivotal technology for future wireless communications. The simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) can both transmit and reflect signals, enabling comprehensive signal control and expanding application scenarios. This paper introduces an unmanned aerial vehicle (UAV) to further enhance system flexibility and proposes an optimization design for the spectrum efficiency of the STAR-RIS-UAV-assisted wireless communication system. We present a deep reinforcement learning (DRL) algorithm capable of iteratively optimizing beamforming, phase shifts, and UAV positioning to maximize the system's sum rate through continuous interactions with the environment. To improve exploration in deterministic policies, we introduce a stochastic perturbation factor, which enhances exploration capabilities. As exploration is strengthened, the algorithm's ability to accurately evaluate the state-action value function becomes critical. Thus, based on the deep deterministic policy gradient (DDPG) algorithm, we propose a convolution-augmented deep deterministic policy gradient (CA-DDPG) algorithm that balances exploration and evaluation to improve the system's sum rate. The simulation results demonstrate that the CA-DDPG algorithm effectively interacts with the environment, optimizing the beamforming matrix, phase shift matrix, and UAV location, thereby improving system capacity and achieving better performance than other algorithms.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2512.01202

Country: Asia > China (0.68)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Shannon McCurdy

Neural Information Processing SystemsNov-20-2025, 20:47:24 GMT

Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results.

artificial intelligence, data mining, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

A Long N-step Surrogate Stage Reward for Deep Reinforcement Learning Junmin Zhong Arizona State University Ruofan Wu Arizona State University Jennie Si Arizona State University

Neural Information Processing SystemsOct-8-2025, 08:19:09 GMT

DDPG (TD3) [7], have demonstrated their great potential. Contributions . 1) We introduce a new, simple yet effective surrogate reward They usually proceed as follows.

algorithm, equation, lnss, (14 more...)

Neural Information Processing Systems

Country: